Method and apparatus for compressing and decompressing sparse data sets

ABSTRACT

Embodiments of the present disclosure include a digital circuit and method for multi-stage compression. Digital data values are compressed using a multi-stage compression algorithm and stored in a memory. A decompression circuit receives the values and performs a partial decompression. The partially decompressed values are provided to a processor, which performs the final decompression. In one embodiment, a vector of N length compressed values is decompressed using a first bit mask into two N length sets having non-zero values. The two N length sets are further decompressed using two M length bit masks into M length sparse vectors, each having non-zero values.

BACKGROUND

The present disclosure relates generally to digital circuits and systems, and in particular to a method and apparatus for compression multiplexing for sparse computations.

Many modern digital systems and applications are required to process large volumes of digital values. For example, artificial intelligence applications may be required to store (e.g., in memory) and process (e.g., perform mathematical operations on) huge arrays of digital values representing activations or weights. However, in many cases such large volumes of data may contain a large number of zero values. Computation of zero values is often an exception for processing and may be skipped or otherwise ignored by a system.

Input data sets typically have zero values and non-zero values randomly distributed over the data set with zero values typically representing a certain percentage (referred to as sparsity) of the total data set. For AI accelerators and workloads, for example, sparsity is an increasingly important feature that needs to be supported in hardware to achieve performance speed-up. In particular, storing and retrieving data sets from memory constitutes a burdensome overhead for the system.

Embodiments described herein advantageously store compressed data in memory to reduce memory bandwidth associated with reading data out of memory into a processor.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates a digital circuit according to an embodiment.

FIG. 2 illustrates a method according to an embodiment.

FIG. 3 illustrates an example compression and decompression technique according to an embodiment.

FIG. 4 illustrates an algorithm for compressing and decompressing data according to an embodiment.

FIG. 5 illustrates another example compression and decompression technique according to an embodiment.

FIG. 6 illustrates a system for compressing and decompressing data according to an embodiment.

FIG. 7 illustrates a simplified block diagram of an example computer system used to execute code according to various embodiments.

DETAILED DESCRIPTION

Described herein is a hierarchical compression technique. In the following description, for purposes of explanation, numerous examples and specific details are set forth in order to provide a thorough understanding of some embodiments. Various embodiments as defined by the claims may include some or all of the features in these examples alone or in combination with other features described below and may further include modifications and equivalents of the features and concepts described herein.

In some embodiments, features and advantages of the present disclosure include circuit techniques for compressing and decompressing sets of digital values to and from memory to advantageously reduce memory read and write times and increase memory bandwidth. The techniques described herein have a wide range of uses, including use in artificial intelligence processors, for example.

FIG. 1 illustrates a digital circuit according to an embodiment. Features and advantages of the present disclosure include retrieving compressed digital values from memory, decompressing the digital values using a first decompression algorithm, and sending compressed digital values to a processor for further processing. For example, in some embodiments, a processor may further decompress the digital values, while in some embodiments a processor may be able to operate on compressed values. As mentioned above, the presently disclosed hierarchical approach reduces memory transaction time and allows processors to retrieve data faster. In one embodiment, the present disclosure includes a memory 101 storing data comprising a plurality of digital values. In various embodiments, the data is compressed using a multi-stage (aka multi-level) compression algorithm. Accordingly, data may be read from memory 101 and received by a decompression circuit 102, which performs a first decompression 114. Decompression circuit 102 may receive N non-zero digital values and produce two sets of digital values of length N, for example. Processor 103 may receive the results of the first decompression from circuit 102 and perform a second decompression 115. In various embodiments, the present techniques may be used advantageously on sparse data sets as described in more detail below, and the first and second decompressions are part of a multi-stage interdependent sparsity based compression algorithm, for example.

More specifically, in one example embodiment, memory 101 stores N non-zero digital values in a block 110, where N is a first integer (e.g., N=32). Further, the N non-zero digital values 110 may be associated with a bit mask 111 specifying positions of the N non-zero digital values. In this example, the bit mask is of length 2*N. It is to be understood that bit masks described herein may use a variety of techniques to specify the positions of non-zero (NZ) values, including various forms of delta coding or positional coding, for example. Decompression circuit 102 receives the N non-zero digital values 110 and the 2*N length bit mask 111 and produces two N length sets of digital values 112 and 113. Each of the two N length sets of digital values 112 and 113 may comprise N/2 non-zero digital values from the N non-zero digital values. Positions of the N/2 non-zero digital values in each of the two N length sets of digital values 112 and 113 may be set based on the 2*N length bit mask, for example. Processor 103 receives the two N length sets of digital values and two M length bit masks (not shown), where M is a second integer greater than N, and decompresses the two N length sets of digital values into two M length sets of digital values each comprising N/2 non-zero elements. M divided by N (M/N) may be a power of 2, for example (e.g., 128/32=4=2²), as illustrated in the examples below.
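The expansion performed by decompression circuit 102 can be sketched in software as follows. This is a minimal illustration only, not the disclosed circuit: the function name expand_block is hypothetical, the bit mask is modeled as a plain list of 0/1 integers (the disclosure also allows delta or positional coding), and it is assumed that the first N mask bits describe set 112 and the last N bits describe set 113, with the packed values stored in that same order.

```python
def expand_block(nz_values, mask):
    """Scatter N packed non-zero values into two N-length sets.

    Assumptions (not stated in the source): `mask` is a list of 0/1
    integers of length 2*N, its first half describes the first output
    set and its second half the second, and `nz_values` is ordered to
    match the mask from left to right.
    """
    n = len(nz_values)
    if len(mask) != 2 * n:
        raise ValueError("mask must contain 2*N bits")
    if sum(mask) != n:
        raise ValueError("mask must mark exactly N positions")
    values = iter(nz_values)
    # Walk the mask once: a 1 consumes the next packed value, a 0 emits zero.
    expanded = [next(values) if bit else 0 for bit in mask]
    return expanded[:n], expanded[n:]
```

With N=4, for instance, four packed values and an 8-bit mask yield two 4-length sets, each holding N/2=2 non-zero values at the masked positions.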

FIG. 2 illustrates a method of decompressing data according to an embodiment. At 201, N non-zero digital values and a bit mask of length 2*N are received from a memory (e.g., in a decompression circuit). At 202, the N non-zero digital values are decompressed using the 2*N length bit mask to produce two N length sets of digital values each comprising N/2 non-zero digital values from the N non-zero digital values. Positions of the N/2 non-zero digital values in each of the two N length sets of digital values may be set based on the 2*N length bit mask, for example. At 203, the two N length sets of digital values and two M length bit masks are received in a processor, where M is a second integer greater than N. The processor may decompress the two N length sets of digital values into two M length sets of digital values each comprising N/2 non-zero elements. However, as mentioned above, in some embodiments, a processor may be able to operate on compressed values.

FIG. 3 illustrates an example compression and decompression technique according to an embodiment. The present example illustrates one multi-stage compression technique for storing data in a memory. In this example, two sets of data 301 and 302 each have M=128 digital values. For the first level compression, 16 (e.g., N/2=16) NZ digital values out of each of the 128 digital values are selected. The selected digital values may be the largest 16 values in each data set, for example. The result is two data sets 304 and 305, which are referred to as “sparsified” (i.e., made sparse), each comprising 128 values of which 16 in each set are NZ. The 128 length data sets may comprise 8-bit mantissas, a sign bit, and a shared exponent illustrated at 303 a-b, for example. Additionally, first and second 128 length bit masks 308 and 309 specifying the positions of the 16 NZ values in each set are generated. Next, two 32 length sets 306 and 307 are generated, one from each of the 128 length sets 304 and 305. 32 length sets 306 and 307 each include 16 NZ values from the corresponding 128 length sets.

For the second level compression, a 64 length bit mask 311 is generated. Bit mask 311 specifies the positions of the 16 NZ values in each 32 length set 306 and 307. For example, a first half of a 64 bit bit mask may include a ‘1’ in positions where a value is NZ in set 306 and a second half of the 64 bit bit mask may include a ‘1’ in positions where a value is NZ in set 307. Next, a 32 length set 310 of NZ values is generated from the NZ values in the two 32 length sets 306 and 307. The 32 length set 310 of NZ values may be stored in a memory with the 64 length bit mask 311 and the 128 length bit masks 308 and 309.
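The two compression levels described for FIG. 3 can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the disclosed implementation: the function names first_level and second_level are hypothetical, values are plain integers (mantissa, sign, and shared exponent handling is omitted), "largest" is taken to mean largest magnitude, every selected value is assumed to be non-zero, and, because the source does not specify where the kept values sit inside each 32 length set, they are placed in the even slots purely for illustration.

```python
def first_level(values, keep):
    """Keep the `keep` largest-magnitude values of one M-length set.

    Returns (mask, compact): an M-length 0/1 mask of the kept positions
    and a compact set of length 2*keep holding the kept values.
    Assumption: kept values go in the even slots of the compact set;
    the source leaves this placement unspecified.
    """
    order = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)
    chosen = sorted(order[:keep])  # restore original left-to-right order
    chosen_set = set(chosen)
    mask = [1 if i in chosen_set else 0 for i in range(len(values))]
    compact = [0] * (2 * keep)
    for slot, i in enumerate(chosen):
        compact[2 * slot] = values[i]
    return mask, compact


def second_level(set_a, set_b):
    """Pack the non-zero values of two N-length sets into one N-length set.

    Returns (packed, mask): the packed non-zero values and a 2*N-length
    0/1 mask whose first half describes set_a and second half set_b.
    """
    joined = set_a + set_b
    mask = [1 if v != 0 else 0 for v in joined]
    packed = [v for v in joined if v != 0]
    return packed, mask
```

Calling first_level with keep=16 on each 128 length set and passing the two resulting 32 length sets to second_level gives a 32 length packed set, a 64 length mask, and two 128 length masks, matching the sizes in the example above.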

The multi-stage compressed data may be retrieved at much higher rates than uncompressed or less compressed data. For instance, the 32 length set 310 of NZ values, the 2*N length bit mask 311, and the first and second 128 length bit masks may be retrieved from memory and coupled to a 2nd level decompression circuit 312. Decompression circuit 312 decompresses the 32 length set 310 of NZ values into two 32 length sets that each have 16 NZ values. The 64 length bit mask 311 is used for the second level decompression. Circuit 312 is referred to as “2nd level” decompression because it decompresses the 32 length NZ values back into two 32 length values 306 and 307, which is the decompression associated with the 2nd level compression described above. Finally, 32 length sets 306 and 307 and 128 length bit masks 308 and 309 (and shared exponents 303 a-b) may be sent to a processor for 1st level decompression, where the two 32 length sets are decompressed into two 128 length sets 304 and 305, each having 16 NZ digital values, using the first and second 128 length bit masks 308 and 309.
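To give a rough sense of the reduction in bits read from memory in this example, the arithmetic can be worked through as below. The figures are an estimate under stated assumptions, not values given in the disclosure: each stored value is assumed to occupy 9 bits (8-bit mantissa plus sign bit), each mask is assumed to be stored as one bit per position, and the shared exponents are ignored. Note that the first level selection discards values, so the comparison is between the dense input and its sparsified, compressed form.

```python
M, N = 128, 32        # sizes from the FIG. 3 example
BITS_PER_VALUE = 9    # assumption: 8-bit mantissa + 1 sign bit

# Two dense M-length sets stored without compression.
uncompressed_bits = 2 * M * BITS_PER_VALUE

# N packed values, one 2*N-bit mask, and two M-bit masks.
compressed_bits = N * BITS_PER_VALUE + 2 * N + 2 * M

ratio = uncompressed_bits / compressed_bits
print(uncompressed_bits, compressed_bits, round(ratio, 2))
```

Under these assumptions the dense form takes 2304 bits and the compressed form 608 bits, a reduction of a little under 4x.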

FIG. 4 illustrates an algorithm for compressing and decompressing data according to an embodiment. At 401, first and second M length sets of digital values are received. At 402, N/2 non-zero (NZ) digital values are selected from the first set and N/2 non-zero (NZ) digital values are selected from the second set. At 403, first and second M length bit masks are generated. The M length bit masks specify the positions of the N/2 NZ values in each set selected at 402. At 404, two N length sets of values are generated. Each N length set of values includes N/2 NZ values from the two M length sets. Steps 402-404 constitute a first level compression of the multi-stage compression in this example.

The following steps 405-406 constitute a second level compression of the multi-stage compression in this example. At 405, a 2*N length bit mask is generated that specifies the positions of the N/2 NZ values in each N length set. At 406, an N length set of NZ values is generated from the NZ values in the two N length sets. At 407, the N length set of NZ values, the 2*N length bit mask, and the first and second M length bit masks may be stored in a memory circuit.

The N length set of NZ values, the 2*N length bit mask, and the first and second M length bit masks may be retrieved from memory at 408. At 409, the 2nd level decompression step occurs, wherein the N length set of NZ values is decompressed into two N length sets having N/2 NZ values using the 2*N length bit mask. The 1st level decompression occurs at 410, where the two N length sets are decompressed into two M length sets having N/2 NZ digital values using the first and second M length bit masks. The 1st level decompression may be performed by a processor, such as an artificial intelligence processor or other processor configured to process sparse data sets (e.g., data sets with a significant number of zero values where zero values are skipped and/or where NZ value processing is accelerated).
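The 1st level decompression at 410 can be sketched for a single set as follows. This is a minimal sketch, not the processor's actual method: the function name expand_to_sparse is hypothetical, the M length bit mask is modeled as a list of 0/1 integers, and it is assumed that the non-zero values of the N length set appear in the same left-to-right order as the 1 bits of the mask.

```python
def expand_to_sparse(compact, mask):
    """Expand one N-length set into an M-length sparse vector.

    Assumptions (not stated in the source): `mask` is an M-length list
    of 0/1 integers, and the non-zero entries of `compact` are ordered
    to match the 1 bits of `mask` from left to right.
    """
    nz = [v for v in compact if v != 0]
    if len(nz) != sum(mask):
        raise ValueError("mask must mark one position per non-zero value")
    values = iter(nz)
    # A 1 bit takes the next non-zero value; a 0 bit produces a zero.
    return [next(values) if bit else 0 for bit in mask]
```

The processor would apply this once per set, using the first M length bit mask for the first N length set and the second M length bit mask for the second.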

FIG. 5 illustrates another example compression and decompression technique according to an embodiment. In various embodiments, digital values being compressed may be represented in different formats. For instance, the example of FIG. 3 may represent digital data values using an 8 bit mantissa, a sign bit, and a shared exponent. In the example shown in FIG. 5, the values are represented as a 4 bit mantissa, a sign bit, and a shared exponent. However, the same hardware may execute both B and B/2 bit representations, where B is the bit length (B=8 or B/2=4) using substantially the same algorithm as illustrated below.

In FIG. 5, two sets of data 501 and 502 each have M=128 digital values. For the first level compression, 16 (e.g., N/2=16) NZ digital values out of each of the 128 digital values are selected. However, in this case the values are selected in adjacent pairs (“pair-wise”) so that the total length of the selected values is 8 bits, which advantageously allows the algorithm (with slight modifications) to run on the same hardware. The selected digital values may be the largest 16 values in each data set, for example. In another embodiment, pairs of values may be selected based on pairwise highest (absolute) value. It is to be understood that a variety of other algorithms are possible. The result is two sparsified data sets 504 and 505, each comprising 128 values of which 16 in each set are pairs of NZ values 504 a and 505 a, for example. The 128 length data sets may comprise 4-bit mantissas, a sign bit, and a shared exponent illustrated at 503 a-b, for example. Additionally, first and second 128 length bit masks 506 and 507 specifying the positions of the 16 NZ values in each set are generated. Each mask specifies locations of the pairs of bits in each sparsified data set. Because the mask specifies pairs of bits, in some embodiments each mask may be half the length of the vectors (e.g., M/2=128/2=64), advantageously further compressing the data. Next, two 32 length sets 510 and 511 are generated, one from each of the 128 length sets 504 and 505. 32 length sets 510 and 511 each include 16 pairs of NZ values from the corresponding 128 length sets.
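The pair-wise selection and the resulting half-length mask can be sketched as below. This is one possible reading of the "pairwise highest (absolute) value" option, offered as an illustration only: the function name select_pairs and the keep_pairs parameter are hypothetical, pairs are formed from adjacent even/odd positions, and each pair is ranked by the larger magnitude of its two members.

```python
def select_pairs(values, keep_pairs):
    """Select adjacent pairs by highest absolute value within the pair.

    Returns (mask, kept): an M/2-length 0/1 mask with one bit per pair,
    and the kept pairs in their original order. Pairing adjacent
    even/odd positions and ranking by the larger magnitude in each pair
    are assumptions made for this sketch.
    """
    if len(values) % 2 != 0:
        raise ValueError("input length must be even")
    pairs = [(values[i], values[i + 1]) for i in range(0, len(values), 2)]
    order = sorted(
        range(len(pairs)),
        key=lambda p: max(abs(pairs[p][0]), abs(pairs[p][1])),
        reverse=True,
    )
    chosen = sorted(order[:keep_pairs])  # restore original pair order
    chosen_set = set(chosen)
    mask = [1 if p in chosen_set else 0 for p in range(len(pairs))]
    kept = [pairs[p] for p in chosen]
    return mask, kept
```

Because one mask bit covers two values, a 128 length input produces a 64 length mask, which is the M/2 mask length mentioned above.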

For the second level compression, a 64 length bit mask 520 is generated. In some embodiments, bit mask 520 may also be reduced in length to reduce the number of stored bits. Bit mask 520 specifies the positions of the 16 NZ values in each 32 length set 510 and 511. For example, a first half of a 64 bit bit mask may include a ‘1’ in positions where a value is NZ in set 510 and a second half of the 64 bit bit mask may include a ‘1’ in positions where a value is NZ in set 511. Next, a 32 length set 521 of NZ values is generated from the NZ values in the two 32 length sets 510 and 511. The 32 length set 521 of NZ values may be stored in a memory with the 64 length bit mask 520 and the 64 length bit masks 512 and 513.

The multi-stage compressed data may be retrieved at much higher rates than uncompressed or less compressed data. For instance, the 32 length set 521 of NZ values, the 64 length bit mask 520, and the first and second 64 length bit masks may be retrieved from memory and coupled to a 2nd level decompression circuit 550. Decompression circuit 550 decompresses the 32 length set 521 of NZ values into two 32 length sets 510-511 that each have 16 NZ values. The 64 length bit mask 520 is used for the second level decompression. Finally, 32 length sets 510 and 511 and 64 length bit masks 512 and 513 (and shared exponents 503 a-b) may be sent to a processor for 1st level decompression, where the two 32 length sets are decompressed into two 128 length sets 504 and 505, each having 16 pairs of NZ digital values, using the first and second 64 length bit masks 512 and 513.

FIG. 6 illustrates a system for compressing and decompressing data according to an embodiment. In this example, digital data values 610 and 611 are compressed by a computer system 601 and loaded into memory 630, which resides on the same integrated circuit 602 as a processor 650 and decompression circuit 650. Computer system 601 may perform the first and second level compression according to various embodiments illustrated in the examples here. Computer system 601 may be a server, for example, and integrated circuit 602 may be coupled to the computer system over a local bus or network connection (e.g., Ethernet). Computer system 601 may implement the first and second level compression in hardware (e.g., a state machine, an ASIC, or FPGA), software, or as a combination of hardware and software, for example. Input vectors of M data values 610 and 611 are compressed into N length vectors 612 and 613, each with N/2 NZ values, and M length bit masks 614 and 615 (or M/2 length bit masks for values selected pairwise). Next, an N length vector 616 of NZ values is generated with a 2*N length bit mask 617. The multi-stage compressed data 620 comprising the bit mask 617, vector of NZ values 616, and M (or M/2) length bit masks 618 and 619 is loaded into memory 630. The data values may then be read from memory 630 by processor 650. Retrieved data is received by decompression circuit 630, which performs the first level decompression described above. Processor 650 may perform the second level decompression, for example.

Volumes of digital data values may thus be compressed by a multi-stage algorithm and loaded into memory 630 with corresponding reductions in memory usage and memory write and read transactions, thereby reducing the memory bandwidth used to move data into the processor, for example. In one embodiment, the data comprises neural network activations or neural network weights, and processor 650 is an Artificial Intelligence (AI) processor optimized for neural network computations, such as multiplication, accumulation, and the like. An example processor may be optimized for sparse computations, where zeros are ignored and only NZ results are processed by the multipliers, accumulators, or other hardware resources to yield faster results, for example.

FIG. 7 illustrates a simplified block diagram of an example computer system used to execute program code according to various embodiments. In some embodiments, computer system 700 executes code to compress data, decompress data, or both as set forth herein. In another embodiment, computer system 700 executes hardware description code to generate a decompression circuit and/or other portions of an integrated circuit to perform the techniques described herein. A hardware description language (HDL) is a specialized computer language used to describe the structure and behavior of electronic circuits, and most commonly, digital logic circuits. HDL code may be executed on a computer system to generate digital logic circuits, including circuits described herein. FIG. 7 illustrates a simplified block diagram of an example computer system 700, which can be used to implement the techniques described in the foregoing disclosure. In some embodiments, computer system 700 may be used to implement a control processor 702, for example. As shown in FIG. 7, computer system 700 includes one or more processors 702 that communicate with a number of peripheral devices via a bus subsystem 704. These peripheral devices may include a storage subsystem 706 (e.g., comprising a memory subsystem 708 and a file storage subsystem 710) and a network interface subsystem 716. Some computer systems may further include user interface input devices 712 and/or user interface output devices 714.

Bus subsystem 704 can provide a mechanism for letting the various components and subsystems of computer system 700 communicate with each other as intended. Although bus subsystem 704 is shown schematically as a single bus, alternative embodiments of the bus subsystem can utilize multiple busses.

Network interface subsystem 716 can serve as an interface for communicating data between computer system 700 and other computer systems or networks. Embodiments of network interface subsystem 716 can include, e.g., Ethernet, a Wi-Fi and/or cellular adapter, a modem (telephone, satellite, cable, ISDN, etc.), digital subscriber line (DSL) units, and/or the like.

Storage subsystem 706 includes a memory subsystem 708 and a file/disk storage subsystem 710. Subsystems 708 and 710 as well as other memories described herein are examples of non-transitory computer-readable storage media that can store executable program code and/or data that produce circuits having the functionality of embodiments of the present disclosure.

Memory subsystem 708 includes a number of memories including a main random access memory (RAM) 718 for storage of instructions and data during program execution and a read-only memory (ROM) 720 in which fixed instructions are stored. File storage subsystem 710 can provide persistent (e.g., non-volatile) storage for program and data files, and can include a magnetic or solid-state hard disk drive, an optical drive along with associated removable media (e.g., CD-ROM, DVD, Blu-Ray, etc.), a removable flash memory-based drive or card, and/or other types of storage media known in the art.

It should be appreciated that computer system 700 is illustrative and many other configurations having more or fewer components than system 700 are possible.

FURTHER EXAMPLES

Each of the following non-limiting features in the following examples may stand on its own or may be combined in various permutations or combinations with one or more of the other features in the examples below.

In one embodiment, the present disclosure includes a digital circuit comprising: memory, the memory storing data comprising a plurality of digital values, wherein N non-zero digital values are stored in a block, where N is a first integer, the N non-zero digital values being associated with a first bit mask specifying positions of the N non-zero digital values; a decompression circuit to receive the N non-zero digital values and the first bit mask and produce two N length sets of digital values from the N non-zero digital values, wherein positions of the non-zero digital values in each of the two N length sets of digital values are set based on the first bit mask; and a processor to receive the two N length sets of digital values and two second bit masks, and process the two N length sets of digital values using the two second bit masks.

In another embodiment, the present disclosure includes a method of decompressing data comprising: receiving, from a memory, N non-zero digital values and a first bit mask specifying positions of the N non-zero digital values, where N is a first integer, and wherein the N non-zero digital values are stored in a block associated with the first bit mask; decompressing the N non-zero digital values using the first bit mask to produce two N length sets of digital values each comprising non-zero digital values from the N non-zero digital values, wherein positions of the non-zero digital values in each of the two N length sets of digital values are set based on the first bit mask; and receiving the two N length sets of digital values and two second bit masks in a processor, where M is a second integer greater than N, and processing, by the processor, the two N length sets of digital values using the second bit masks.

In another embodiment, the present disclosure includes a machine-readable medium storing a program executable by a computer, the program comprising sets of instructions for: receiving, from a memory, N non-zero digital values and a first bit mask specifying positions of the N non-zero digital values, where N is a first integer, and wherein the N non-zero digital values are stored in a block associated with the first bit mask; decompressing the N non-zero digital values using the first bit mask to produce two N length sets of digital values each comprising non-zero digital values from the N non-zero digital values, wherein positions of the non-zero digital values in each of the two N length sets of digital values are set based on the first bit mask; and receiving the two N length sets of digital values and two second bit masks in a processor, where M is a second integer greater than N, and processing, by the processor, the two N length sets of digital values using the second bit masks.

In one embodiment, the processor further decompresses the two N length sets of digital values using the two second bit masks into two M length sets of digital values, where M is a second integer greater than N.

In one embodiment, the data stored in the memory comprising the plurality of digital values is compressed using a multi-stage compression algorithm.

In one embodiment, M divided by N is a power of 2.

In one embodiment, the first bit mask is at least of length 2*N and the two N length sets of digital values each comprise N/2 non-zero digital values.

In one embodiment, the two M length bit masks are stored in said memory with the N non-zero digital values and the bit mask of length 2*N.

In one embodiment, the first bit mask comprises 2*N bits.

In one embodiment, the two second bit masks each comprise M bits.

In one embodiment, the N non-zero digital values are stored in the memory as pairs of values.

In one embodiment, the two second bit masks each comprise M/2 bits.

The above description illustrates various embodiments along with examples of how aspects of some embodiments may be implemented. The above examples and embodiments should not be deemed to be the only embodiments, and are presented to illustrate the flexibility and advantages of some embodiments as defined by the following claims. Based on the above disclosure and the following claims, other arrangements, embodiments, implementations and equivalents may be employed without departing from the scope hereof as defined by the claims.

What is claimed is:
1. A digital circuit comprising: memory, the memory storing data comprising a plurality of digital values, wherein N non-zero digital values are stored in a block, where N is a first integer, the N non-zero digital values being associated with a first bit mask specifying positions of the N non-zero digital values; a decompression circuit to receive the N non-zero digital values and the first bit mask and produce two N length sets of digital values from the N non-zero digital values, wherein positions of the non-zero digital values in each of the two N length sets of digital values are set based on the first bit mask; and a processor to receive the two N length sets of digital values and two second bit masks, and process the two N length sets of digital values using the two second bit masks.
2. The circuit of claim 1, wherein the processor further decompresses the two N length sets of digital values using the two second bit masks into two M length sets of digital values, where M is a second integer greater than N.
3. The circuit of claim 1, wherein the data stored in the memory comprising the plurality of digital values is compressed using a multi-stage compression algorithm.
4. The circuit of claim 1, wherein M divided by N is a power of 2.
5. The circuit of claim 1, wherein the first bit mask is at least of length 2*N and the two N length sets of digital values each comprise N/2 non-zero digital values.
6. The circuit of claim 5, wherein the two M length bit masks are stored in said memory with the N non-zero digital values and the bit mask of length 2*N.
7. The circuit of claim 5, wherein the first bit mask comprises 2*N bits.
8. The circuit of claim 1, wherein the two second bit masks each comprise M bits.
9. The circuit of claim 1, wherein the N non-zero digital values are stored in the memory as pairs of values.
10. The circuit of claim 9, wherein the two second bit masks each comprise M/2 bits.
11. A method of decompressing data comprising: receiving, from a memory, N non-zero digital values and a first bit mask specifying positions of the N non-zero digital values, where N is a first integer, and wherein the N non-zero digital values are stored in a block associated with the first bit mask; decompressing the N non-zero digital values using the first bit mask to produce two N length sets of digital values each comprising non-zero digital values from the N non-zero digital values, wherein positions of the non-zero digital values in each of the two N length sets of digital values are set based on the first bit mask; and receiving the two N length sets of digital values and two second bit masks in a processor, where M is a second integer greater than N, and processing, by the processor, the two N length sets of digital values using the second bit masks.
12. The method of claim 11, wherein the processor further decompresses the two N length sets of digital values using the two second bit masks into two M length sets of digital values, where M is a second integer greater than N.
13. The method of claim 11, wherein the N non-zero digital values are stored in the memory as pairs of values.
14. The method of claim 11, wherein the data stored in the memory comprising the plurality of digital values is compressed using a multi-stage compression algorithm.
15. The method of claim 11, wherein the two M length bit masks are stored in said memory with the N non-zero digital values and the bit mask.
16. The method of claim 11, wherein M divided by N is a power of 2.
17. The method of claim 11, wherein the first bit mask comprises 2*N bits.
18. The method of claim 11, wherein the two second bit masks each comprise M bits.
19. The method of claim 11, wherein the N non-zero digital values are stored in the memory as pairs of values.
20. A non-transitory machine-readable medium storing a program executable by a computer, the program comprising sets of instructions for: receiving, from a memory, N non-zero digital values and a first bit mask specifying positions of the N non-zero digital values, where N is a first integer, and wherein the N non-zero digital values are stored in a block associated with the first bit mask; decompressing the N non-zero digital values using the first bit mask to produce two N length sets of digital values each comprising non-zero digital values from the N non-zero digital values, wherein positions of the non-zero digital values in each of the two N length sets of digital values are set based on the first bit mask; and receiving the two N length sets of digital values and two second bit masks in a processor, where M is a second integer greater than N, and processing, by the processor, the two N length sets of digital values using the second bit masks.